Conceptual introduction to k-means clustering
- How it works mechanistically, and considerations for analysis
- Application to time-series data
- Example of a study which used the method
10 November, 2023
This will also eventually (…) be a tutorial on https://jsteffman.github.io/teaching.html.
R package for traditional k-means clustering: Charrad, M., Ghazzali, N., Boiteau, V., Niknafs, A. (2014). NbClust: An R Package for Determining the Relevant Number of Clusters in a Data Set. Journal of Statistical Software, 61(6), 1-36. URL http://www.jstatsoft.org/v61/i06/.
R package for time-series k-means clustering: Genolini, C., Alacoque, X., Sentenac, M., Arnaud, C. (2015). kml and kml3d: R Packages to Cluster Longitudinal Data. Journal of Statistical Software, 65(4), 1-34. URL http://www.jstatsoft.org/v65/i04/.
For much more on this data see: Cole, J. & Steffman, J. & Shattuck-Hufnagel, S. & Tilsen, S. (2023) “Hierarchical distinctions in the production and perception of nuclear tunes in American English”, Laboratory Phonology 14(1). doi: https://doi.org/10.16995/labphon.9437
Various criteria exist for choosing k: the general intuition is that the “best” partition minimizes within-cluster variance while maximizing between-cluster variance
R packages for clustering allow comparison across multiple criteria, and can also select k by “majority vote” across those criteria
In addition to this “best” k value, other values of k may be of interest as informed by predictions or theory
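The idea of scoring partitions by several criteria and taking a majority vote can be sketched in a few lines. The example below is a minimal illustration in plain Python, not the NbClust or kml implementation: the toy data (three well-separated 2-D groups), the candidate range k = 2 to 6, the k-means++ seeding, and the choice of three criteria (Calinski-Harabasz, mean silhouette width, Davies-Bouldin) are all assumptions made for the sake of the example.

```python
import math
import random
from collections import Counter

def mean(pts):
    return tuple(sum(c) / len(pts) for c in zip(*pts))

def kmeans(points, k, n_init=5, max_iter=100, seed=0):
    """Lloyd's algorithm with k-means++ seeding; returns labels of the best restart."""
    rng = random.Random(seed)
    best_labels, best_wss = None, math.inf
    for _ in range(n_init):
        centers = [rng.choice(points)]
        while len(centers) < k:  # k-means++: far-away points are sampled more often
            d2 = [min(math.dist(p, c) ** 2 for c in centers) for p in points]
            centers.append(rng.choices(points, weights=d2, k=1)[0])
        for _ in range(max_iter):
            labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
            updated = []
            for j in range(k):
                members = [p for p, l in zip(points, labels) if l == j]
                updated.append(mean(members) if members else centers[j])
            if updated == centers:  # assignments are stable
                break
            centers = updated
        wss = sum(math.dist(p, centers[l]) ** 2 for p, l in zip(points, labels))
        if wss < best_wss:
            best_labels, best_wss = labels, wss
    return best_labels

def groups(points, labels):
    out = {}
    for p, l in zip(points, labels):
        out.setdefault(l, []).append(p)
    return list(out.values())

def calinski_harabasz(points, labels):
    """Ratio of between- to within-cluster variance; higher is better."""
    cl = groups(points, labels)
    n, k, grand = len(points), len(cl), mean(points)
    wss = sum(math.dist(p, mean(c)) ** 2 for c in cl for p in c)
    bss = sum(len(c) * math.dist(mean(c), grand) ** 2 for c in cl)
    return (bss / (k - 1)) / (wss / (n - k))

def silhouette(points, labels):
    """Mean silhouette width; higher is better."""
    cl = groups(points, labels)
    total = 0.0
    for i, own in enumerate(cl):
        for p in own:
            if len(own) == 1:
                continue  # singleton clusters score 0 by convention
            a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
            b = min(sum(math.dist(p, q) for q in other) / len(other)
                    for j, other in enumerate(cl) if j != i)
            total += (b - a) / max(a, b)
    return total / len(points)

def davies_bouldin(points, labels):
    """Average worst-case cluster similarity; lower is better."""
    cl = groups(points, labels)
    cents = [mean(c) for c in cl]
    scatter = [sum(math.dist(p, m) for p in c) / len(c) for c, m in zip(cl, cents)]
    worst = [max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                 for j in range(len(cl)) if j != i) for i in range(len(cl))]
    return sum(worst) / len(cl)

# Toy data (assumed for illustration): three 5x5 grids of points around distant centers.
offsets = [(0.3 * dx, 0.3 * dy) for dx in range(-2, 3) for dy in range(-2, 3)]
points = [(cx + ox, cy + oy) for cx, cy in [(0, 0), (10, 0), (5, 9)] for ox, oy in offsets]

# One partition per candidate k, then one vote per criterion.
partitions = {k: kmeans(points, k) for k in range(2, 7)}
votes = [
    max(partitions, key=lambda k: calinski_harabasz(points, partitions[k])),
    max(partitions, key=lambda k: silhouette(points, partitions[k])),
    min(partitions, key=lambda k: davies_bouldin(points, partitions[k])),
]
best_k = Counter(votes).most_common(1)[0][0]
print("votes per criterion:", votes, "-> majority k:", best_k)
```

Each criterion formalizes the within/between trade-off slightly differently, which is why they can disagree on real data and why a vote across them is a common summary. A theoretically motivated k can still be examined by looking up its partition directly (here, `partitions[k]`) regardless of which k wins the vote.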